1. INTRODUCTION

To detect long-term climate trends, it is
essential to produce long-term and consistent data
sets from a variety of different satellite platforms.
With current global cloud climatology data sets,
such as the International Satellite Cloud
Climatology Project (ISCCP) or CLAVR (Clouds
from Advanced Very High Resolution Radiometer),
one of the first processing steps is to determine
whether an imager pixel is obstructed between
the satellite and the surface, i.e., to determine
a cloud "mask." A cloud mask is essential to
studies monitoring changes over ocean, land, or
snow-covered surfaces. As part of the Earth
Observing System (EOS) program, a series of
platforms will be flown beginning in 1997 with the
Tropical Rainfall Measuring Mission (TRMM)
and subsequently the EOS-AM and EOS-PM
platforms in following years. The cloud imager
on TRMM is the Visible and Infrared Scanner (VIRS),
while the Moderate Resolution Imaging
Spectroradiometer (MODIS) is the imager on the EOS
platforms. To be useful for long-term studies, a
cloud masking algorithm should produce consistent
results between existing (AVHRR) data and
future VIRS and MODIS data. The present work
outlines both existing and proposed approaches to
detecting cloud using multispectral narrowband
radiance data.

Clouds generally are characterized by higher
albedos and lower temperatures than the underlying
surface. However, there are numerous conditions
when this characterization is inappropriate,
most notably over snow and ice. Of the cloud
types, cirrus, stratocumulus and cumulus are the
most difficult to detect. Other problems arise
when analyzing data from sun-glint areas over
oceans or lakes, over deserts, or over regions
containing numerous fires and smoke. The cloud
mask effort builds upon the operational experience of
several groups that will now be discussed.

2. HERITAGE ALGORITHMS 

The CERES cloud masking algorithm (Baum
et al. 1994) will rely heavily upon a rich heritage
of both NASA and NOAA experience with global
data analysis. Initial algorithm design will
incorporate the approaches used by ISCCP
(International Satellite Cloud Climatology
Project) (Rossow and Garder 1993), CLAVR (Clouds
from AVHRR) (Stowe et al. 1991), and SERCAA
(Support of Environmental Requirements
for Cloud Analysis and Archive). The ISCCP
algorithms are based upon two channels, one in
the visible wavelength region and one in the
infrared. The CLAVR approach uses all five
channels of the AVHRR instrument. The CLAVR
multispectral threshold approach incorporates
narrowband channel difference and ratio tests,
including dynamic threshold specification with
clear-sky radiation statistics.

The SERCAA (Gustafson et al. 1994) algorithm
is operational at the Phillips Laboratory,
Hanscom Air Force Base, and uses all
AVHRR radiometric channels. SERCAA is
sponsored jointly by the Department of Defense,
Department of Energy, and Environmental
Protection Agency Strategic Environmental Research
and Development Program.

The International Satellite Cloud Climatology
Project (ISCCP) cloud masking algorithm is
described by Rossow (1989, 1993), Rossow et al.
(1989) and Seze and Rossow (1991a). Data are
used from the narrowband VIS (0.6 micron) and
the IR (11 micron) channels. The ISCCP
algorithm is based on the premise that the observed
VIS and IR radiances are caused by only two
types of conditions, "cloudy" and "clear," and that
the ranges of radiances and their variability that
are associated with these two conditions do not
overlap (Rossow and Garder 1993). As a result,
the algorithm is based upon thresholds, where a
pixel is classified as "cloudy" only if at least one
radiance value is distinct from the inferred
"clear" value by an amount larger than the
uncertainty in that "clear" value. The uncertainty can
be caused both by measurement errors and by
natural variability. The "threshold" for cloud
detection is the magnitude of the uncertainty in the
clear radiance estimates. This algorithm is
constructed to be "cloud-conservative," minimizing
false cloud detections but missing clouds that
resemble clear conditions.
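
The threshold rule described above can be
illustrated with a minimal sketch. It is not the
ISCCP implementation: the function name, the
dictionary layout, and the numerical values are
hypothetical, and the use of an absolute difference
in both channels is a simplifying assumption.

```python
def is_cloudy(observed, clear, uncertainty):
    """Flag a pixel as cloudy if at least one channel departs from its
    inferred clear-sky value by more than that value's uncertainty.

    All three arguments are dicts keyed by channel name (assumed layout).
    """
    return any(
        abs(observed[ch] - clear[ch]) > uncertainty[ch]
        for ch in observed
    )


# Hypothetical clear-sky estimates: VIS reflectance and IR brightness temperature (K).
clear_sky = {"vis": 0.08, "ir": 290.0}
clear_uncertainty = {"vis": 0.03, "ir": 2.5}

print(is_cloudy({"vis": 0.09, "ir": 289.0}, clear_sky, clear_uncertainty))
print(is_cloudy({"vis": 0.09, "ir": 280.0}, clear_sky, clear_uncertainty))
```

Because a pixel is flagged only when a departure
exceeds the uncertainty, thin or low clouds whose
radiances stay inside that band remain labeled
clear, which is the "cloud-conservative" behavior
noted in the text.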

The NOAA CLAVR algorithm (Phase I) uses
all five channels of AVHRR to derive a global
cloud mask (Stowe et al. 1991). It examines
multispectral information, channel differences,
and spatial differences and then employs a series
of sequential decision tree tests. Cloud-free,
mixed (variably cloudy) and cloudy regions are
identified for 2x2 GAC pixel arrays. If all four
pixels in the array fail all the cloud tests, then the
array is labeled as cloud-free (0% cloudy); if all
four pixels satisfy at least one of the cloud tests, then
the array is labeled as 100% cloudy. If 1 to 3
pixels satisfy a cloud test, then the array is
labeled as mixed and assigned an arbitrary value of
50% cloudy. If all four pixels of a mixed or
cloudy array satisfy a clear-restoral test (required
for snow/ice, ocean specular reflection, and bright
desert surfaces) then the pixel array is
reclassified as "restored-clear" (0% cloudy). The set
of cloud tests is subdivided into daytime ocean
scenes, daytime land scenes, nighttime ocean
scenes and nighttime land scenes. Subsequent
phases of the CLAVR cloud mask, now under
development, will be incorporated as modifications
become available.
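
The 2x2 array labeling logic can be sketched as
follows. This is an illustration of the rules as
stated in the paragraph above, not the CLAVR code;
the function name, argument layout, and label
strings are hypothetical, and the individual cloud
and clear-restoral tests are assumed to have been
evaluated already.

```python
def label_array(cloud_flags, restoral_flags):
    """Label one 2x2 GAC pixel array.

    cloud_flags:    4 booleans, True where a pixel satisfied a cloud test.
    restoral_flags: 4 booleans, True where a pixel satisfied a
                    clear-restoral test.
    Returns (label, percent_cloudy).
    """
    n_cloudy = sum(cloud_flags)
    if n_cloudy == 0:
        return ("cloud-free", 0)
    # A mixed or cloudy array is restored to clear only if all four
    # pixels pass the clear-restoral test.
    if all(restoral_flags):
        return ("restored-clear", 0)
    if n_cloudy == 4:
        return ("cloudy", 100)
    return ("mixed", 50)  # 1 to 3 cloudy pixels: arbitrary 50%


print(label_array([True, False, True, False], [False] * 4))
```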

SERCAA (Support of Environmental Requirements
for Cloud Analysis and Archive) is
the prototype for the US Air Force's new global
cloud analysis model. SERCAA makes use of a 
number of algorithms tailored to sensors on both 
the polar orbiting and geostationary meteorologi- 
cal satellite platforms. The resulting cloud masks 
are determined at sensor pixel resolution rather 
than a common grid. These algorithms have been 
extensively tested at various global locations. 

Unfortunately, existing approaches have
limitations, notably in detecting cloud shadows, cloud
over snow- or ice-covered surfaces, clouds in
sun-glint areas, fires and smoke from biomass
burning, and dust storms over deserts. An improved
global cloud mask appropriate for the EOS time- 
frame is discussed, and examples will be shown 
from application of the cloud mask to existing 
AVHRR data. 


3. NEW METHODOLOGY 

Two new classification methods are used in 
this study. Both methods use a set of pairwise 
decisions to classify samples. Most classification 
methods utilize a small number of features due to 
their multidimensional nature. For those methods,
it is not feasible to use more than 10-20 features.

The feature vector used in this study consists
of 164 spectral, textural, and pseudo-textural
features. The spectral features are created from
the original 5 channels of the AVHRR data and a
sixth channel, which is the reflectance of channel
3. Spectral ratios, differences, arctangents and
various other functions are computed. Grey level


8 th CONF. ON SAT. MET. & OCEAN. 





difference vector (GLDV) textural features are 
computed over a 7x7 mask. Pseudo textures are 
computed over a 3x3 neighborhood. 
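
As an illustration of the textural features, the
sketch below computes four commonly used GLDV
statistics over one window. The text does not say
which GLDV statistics or which pixel displacement
were used in this study, so the choice of mean,
contrast, angular second moment and entropy, the
one-pixel horizontal displacement, and the function
name are all assumptions.

```python
import math


def gldv_features(window, dx=1, dy=0, levels=256):
    """Grey level difference vector statistics for one image window.

    window: 2-D list of integer grey levels in the range 0..levels-1.
    dx, dy: non-negative pixel displacement (assumed, not from the text).
    """
    rows, cols = len(window), len(window[0])
    counts = [0] * levels
    n_pairs = 0
    for r in range(rows - dy):
        for c in range(cols - dx):
            diff = abs(window[r][c] - window[r + dy][c + dx])
            counts[diff] += 1
            n_pairs += 1
    # Normalize the counts into a probability vector of grey level differences.
    probs = [k / n_pairs for k in counts]
    return {
        "mean": sum(d * p for d, p in enumerate(probs)),
        "contrast": sum(d * d * p for d, p in enumerate(probs)),
        "asm": sum(p * p for p in probs),
        "entropy": -sum(p * math.log(p) for p in probs if p > 0),
    }


# A uniform 7x7 window has no texture: every grey level difference is zero.
flat = [[5] * 7 for _ in range(7)]
print(gldv_features(flat))
```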

The pairwise classifiers select a subset of fea- 
tures from the feature vector. The selected fea- 
tures are optimal for distinguishing between pairs 
of classes. This reduces the size of the final fea- 
ture vector to approximately 20 - 30 features. 
The final size of the feature vector is determined 
by the number of tests performed for each pair of 
classes. 

3.1 Training 

The classifiers were trained using the Satellite 
Imagery Visualization System (SIVIS). SIVIS 
allows the user to visualize large satellite images 
and select samples from the data. Representative 
samples of 23 different classes were selected from 
over 40 scenes. The scenes are chosen from three 
main climate regimes: polar, Middle-Eastern
desert, and South American rainforest during the
burning season. A separate classifier is
constructed for each regime.

3.2 Paired Rule Classifier 

The paired rule classifier takes the 164 ele- 
ment feature vector and creates a new feature 
vector consisting of the original 164 features and 
all possible ratios of the form A/B. This results 
in a feature vector with 13694 features. 

Next, the divergence for each pair of classes is 
computed for each feature. For a given feature F, 
the divergence between class i and class j is de- 
fined as: 

DIV(F)_{i,j} = | m_i - m_j | / ( s_i + s_j ) ,

where m_i and m_j are the means for classes i and j,
and s_i and s_j are the standard deviations for those
classes.
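
A minimal sketch of the divergence computation for
a single feature follows. The function name is
hypothetical, and two choices are assumptions not
stated in the text: the population form of the
standard deviation, and the handling of the case in
which both standard deviations are zero.

```python
from statistics import mean, pstdev


def divergence(samples_i, samples_j):
    """Divergence of one feature between classes i and j:
    |m_i - m_j| / (s_i + s_j)."""
    m_i, m_j = mean(samples_i), mean(samples_j)
    s_i, s_j = pstdev(samples_i), pstdev(samples_j)
    if s_i + s_j == 0:
        # Assumed convention: identical constants do not separate the
        # classes at all, distinct constants separate them perfectly.
        return 0.0 if m_i == m_j else float("inf")
    return abs(m_i - m_j) / (s_i + s_j)


# Hypothetical feature values for two well separated classes.
print(divergence([1, 2, 3], [7, 8, 9]))
```

A large value means the class means are far apart
relative to the spread of each class, so the
feature is a good candidate for separating that
pair.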

A list of features, sorted by decreasing diver- 
gence, is constructed for each pair of classes. In 
this study, the 5 features with highest divergence 
are chosen for each class pair. A threshold then 
is determined for each of the chosen features. For 
a given feature F, the threshold T is used as a test 
for discriminating between the two classes i and j 
as follows: 

T = m_i + s_i ( m_j - m_i ) / ( s_i + s_j ) .


If F < T, the test returns class i; if F > T, the
test returns class j. Each class pair has 5 tests in this
study. A histogram tabulates the number of tests
satisfied by each class. After all tests for each
class pair have been completed, the class with the
highest histogram count is chosen.
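
The training and voting steps of the paired rule
classifier can be sketched as below. This is an
illustration under stated assumptions rather than
the authors' code: all names are hypothetical, the
population standard deviation is used, and features
whose two standard deviations are both zero are
skipped. The rule "F < T returns class i" holds
when m_i < m_j, so the sketch records which class
lies below the threshold for each test.

```python
from itertools import combinations
from statistics import mean, pstdev


def train_pair(samples_i, samples_j, n_tests=5):
    """Pick the n_tests features with highest divergence for one class pair.

    Returns a list of (feature_index, threshold, i_is_low) tuples.
    """
    ranked = []
    for f in range(len(samples_i[0])):
        col_i = [s[f] for s in samples_i]
        col_j = [s[f] for s in samples_j]
        m_i, m_j = mean(col_i), mean(col_j)
        s_i, s_j = pstdev(col_i), pstdev(col_j)
        if s_i + s_j == 0:
            continue  # degenerate feature, skipped in this sketch
        div = abs(m_i - m_j) / (s_i + s_j)
        t = m_i + s_i * (m_j - m_i) / (s_i + s_j)
        ranked.append((div, f, t, m_i < m_j))
    ranked.sort(reverse=True)  # decreasing divergence
    return [(f, t, i_is_low) for _, f, t, i_is_low in ranked[:n_tests]]


def train(training, n_tests=5):
    """training: dict mapping class name to a list of feature vectors."""
    return {
        (ci, cj): train_pair(training[ci], training[cj], n_tests)
        for ci, cj in combinations(training, 2)
    }


def classify(x, rules):
    """Run every pairwise test and return the class with the most votes."""
    votes = {}
    for (ci, cj), tests in rules.items():
        votes.setdefault(ci, 0)
        votes.setdefault(cj, 0)
        for f, t, i_is_low in tests:
            winner = ci if (x[f] < t) == i_is_low else cj
            votes[winner] += 1
    return max(votes, key=votes.get)


# Hypothetical two-feature samples (reflectance, brightness temperature in K).
training = {
    "cloud": [[0.8, 250], [0.7, 255], [0.9, 245]],
    "ocean": [[0.05, 290], [0.07, 288], [0.06, 292]],
    "snow": [[0.85, 260], [0.9, 262], [0.8, 258]],
}
rules = train(training)
print(classify([0.06, 291], rules))
```

The threshold T always lies between the two class
means and sits closer to the mean of the class with
the smaller standard deviation.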

3.3 Paired Histogram Classifier 

The paired histogram classifier takes the 164 
element feature vector and constructs histograms 
for each pair of classes. For feature F, two
histograms I_F and J_F are created for classes i and j.
The histogram ranges are scaled to accommodate 
the minimum and maximum values of both 
classes and discretized into 256 bins. The histo- 
gram values for each class are normalized by the 
number of samples in the class. These paired 
histograms are analyzed and sorted, based upon 
overlap and divergence. Overlap, O, is defined as 
follows: 

O(F)_{i,j} = \sum_{x=1}^{256} I_F(x) J_F(x)

Divergence is defined above. 
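
The construction of one pair of normalized
histograms and their overlap can be sketched as
follows. The function names are hypothetical, bins
are indexed from 0 rather than 1, and placing the
maximum value in the last bin is an assumption
about how the range is discretized.

```python
def paired_histograms(values_i, values_j, n_bins=256):
    """Build normalized histograms of one feature for classes i and j
    over a common range spanning both classes."""
    lo = min(min(values_i), min(values_j))
    hi = max(max(values_i), max(values_j))
    width = (hi - lo) / n_bins or 1.0  # guard against a zero-width range

    def histogram(values):
        hist = [0.0] * n_bins
        for v in values:
            b = min(int((v - lo) / width), n_bins - 1)
            hist[b] += 1.0 / len(values)  # normalize by class sample count
        return hist

    return histogram(values_i), histogram(values_j)


def overlap(hist_i, hist_j):
    """Sum over bins of the product of the two normalized histograms."""
    return sum(a * b for a, b in zip(hist_i, hist_j))


# Hypothetical feature values for two classes that do not share any bin.
h_i, h_j = paired_histograms([0, 1, 2, 3], [10, 11, 12, 13])
print(overlap(h_i, h_j))
```

An overlap of zero means no bin is occupied by both
classes, so the feature separates the training
samples of that pair completely.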

The three features with the lowest overlap and 
highest divergence for each pair of classes are 
chosen. For a chosen feature F, the paired
histograms I_F and J_F are used for discriminating
between class i and j.

The histograms can be used as discriminators 
in a variety of ways. The method chosen for this 
study considers the histograms as range specifiers 
for the class pairs. The 256 bins of the
normalized histograms I_F and J_F are compared
bin by bin and the following rules are applied:

if I_F(x) > J_F(x) then      if I_F(x) < J_F(x) then

    I_F(x) = 1                   I_F(x) = 0
    J_F(x) = 0                   J_F(x) = 1

where x = 1, ..., 256.

The paired histograms I_F and J_F now represent
ranges in feature F which correspond to classes i 
and j respectively. The three features used for 
each class pair produce three pairs of histograms 
which are used for classification. This process is 
repeated for each pairwise combination of classes. 

The following procedure is used to classify a
sample. First, calculate the 164-element feature vector for
the sample. Next, compute the histogram bin for
each set of paired histograms. Retrieve the value
at each computed histogram bin and increment a
voting histogram by that value. Finally, examine
the voting histogram and assign the class with
the highest value.
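
The classification procedure can be sketched end to
end for a single feature per class pair. All names
are hypothetical, bins are indexed from 0, and
values outside the training range are clamped to
the first or last bin, which is an assumption. A
bin that neither class occupied in training
contributes no vote.

```python
def make_pair_model(class_i, class_j, feature, values_i, values_j, n_bins=256):
    """Build the binary range histograms for one class pair and one feature."""
    lo = min(values_i + values_j)
    hi = max(values_i + values_j)
    width = (hi - lo) / n_bins or 1.0

    def bin_of(v):
        return max(0, min(int((v - lo) / width), n_bins - 1))

    hist_i, hist_j = [0.0] * n_bins, [0.0] * n_bins
    for v in values_i:
        hist_i[bin_of(v)] += 1.0 / len(values_i)
    for v in values_j:
        hist_j[bin_of(v)] += 1.0 / len(values_j)
    # Each bin is assigned to whichever class has the larger normalized count.
    range_i = [1 if a > b else 0 for a, b in zip(hist_i, hist_j)]
    range_j = [1 if b > a else 0 for a, b in zip(hist_i, hist_j)]
    return {"classes": (class_i, class_j), "feature": feature,
            "bin_of": bin_of, "ranges": (range_i, range_j)}


def classify(x, models):
    """Accumulate votes from every paired histogram and pick the maximum."""
    votes = {}
    for m in models:
        class_i, class_j = m["classes"]
        range_i, range_j = m["ranges"]
        b = m["bin_of"](x[m["feature"]])
        votes[class_i] = votes.get(class_i, 0) + range_i[b]
        votes[class_j] = votes.get(class_j, 0) + range_j[b]
    return max(votes, key=votes.get)


# Hypothetical single-feature training values (reflectance).
models = [make_pair_model("cloud", "ocean", 0,
                          [0.7, 0.8, 0.9], [0.05, 0.06, 0.07])]
print(classify([0.8], models))
```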

4. RESULTS 

The paired rule and paired histogram methods
were both used on three areas of interest: polar,
desert, and South America. These areas were
chosen because they are particularly difficult
areas to classify. Results are preliminary at this
time, but both classifiers are performing well for
all of the scenes analyzed so far. In particular,
the South American classifier is able to detect
smoke from biomass burning. The desert
classifier can detect desert, dust storms and some
sun-glint areas in the ocean. The polar classifier
is able to differentiate between clouds and ice.
We are very encouraged by our results with these
difficult classes.

Color photographs of the classification results
will be presented at the conference.

5. FUTURE WORK 

The classification algorithms are undergoing
continuing revision and enhancement. Work is 
also continuing on developing better features for 
classification. 

The training sample database is constantly 
expanding to include more representative sam- 
ples of each class and more comprehensive cov- 
erage of the earth. 